# INT4 Quantized Inference
## DeepSeek-R1-0528-quantized.w4a16
**RedHatAI** · MIT · Large Language Model · Safetensors

A quantized version of DeepSeek-R1-0528: the weights are quantized to the INT4 data type, significantly reducing GPU memory and disk-space requirements.
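The w4a16 scheme named in these cards keeps activations in 16-bit floating point and stores weights as 4-bit integers with a per-group scale. A minimal sketch of symmetric per-group INT4 quantization, assuming a toy group size and illustrative values (real schemes typically use group sizes of 64 or 128):

```python
# Sketch of w4a16-style weight quantization: weights are stored as 4-bit
# integers plus one scale per group, and dequantized before the matmul.
# Group size and weight values below are illustrative assumptions.

GROUP_SIZE = 4  # real deployments typically use 64 or 128

def quantize_group(weights):
    """Symmetric INT4 quantization of one group: returns (int4_values, scale)."""
    # Map the largest magnitude in the group to the INT4 range (-8..7).
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    """Reconstruct approximate FP values from INT4 codes and the group scale."""
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 0.07, 1.20, -0.90, 0.01, 0.45]
groups = [weights[i:i + GROUP_SIZE] for i in range(0, len(weights), GROUP_SIZE)]

recon = []
for g in groups:
    q, s = quantize_group(g)
    recon.extend(dequantize_group(q, s))

# Worst-case reconstruction error is bounded by half the group's scale.
max_err = max(abs(a - b) for a, b in zip(weights, recon))
```

Per-group scales keep the quantization error proportional to each group's local magnitude, which is why outlier weights in one group do not degrade the precision of the rest of the matrix.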
## Qwen2.5-VL-3B-Instruct-quantized.w4a16
**RedHatAI** · Apache-2.0 · Image-Text-to-Text · Transformers · English

A quantized version of Qwen2.5-VL-3B-Instruct, with weights quantized to INT4 and activations in FP16, designed for efficient vision-text inference.
## Qwen2.5-VL-7B-Instruct-quantized.w4a16
**RedHatAI** · Apache-2.0 · Image-Text-to-Text · Transformers · English

A quantized version of Qwen2.5-VL-7B-Instruct, supporting vision-text input and text output, with weights quantized to INT4 and activations in FP16.
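As a back-of-envelope check on the memory-reduction claim, INT4 weights plus per-group FP16 scales take roughly a quarter of the space of FP16 weights. The parameter count and group size below are illustrative assumptions, not figures for any specific model listed above:

```python
# Rough storage estimate: FP16 weights vs INT4 weights with per-group
# FP16 scales. Both numbers below are assumptions for illustration.
params = 7_000_000_000   # e.g. a 7B-parameter model (assumed)
group_size = 128         # one FP16 scale per 128 weights (assumed)

fp16_bytes = params * 2                                 # 16 bits per weight
int4_bytes = params // 2 + (params // group_size) * 2   # 4-bit weights + scales

fp16_gb = fp16_bytes / 1e9   # ~14.0 GB
int4_gb = int4_bytes / 1e9   # ~3.6 GB
ratio = fp16_bytes / int4_bytes
```

The scale overhead keeps the effective compression just under the ideal 4x, which matches the "significantly reduced GPU memory and disk space" framing in the cards above.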